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We introduce a new method for forecasting emergency call arrival 
rates that combines integer-valued time series models with a dynamic 
^£^ ' latent factor structure. Covariate information is captured via sim- 

ple constraints on the factor loadings. We directly model the count- 
valued arrivals per hour, rather than using an artificial assumption of 
Q, I normality. This is crucial for the emergency medical service context, 

.^^ ■ in which the volume of calls may be very low. Smoothing splines are 

. ' used in estimating the factor levels and loadings to improve long- 

j^ , term forecasts. We impose time series structure at the hourly level, 

rather than at the daily level, capturing the fine-scale dependence in 
addition to the long-term structure. 

Our analysis considers all emergency priority calls received by 
Toronto EMS between January 2007 and December 2008 for which 
^ I an ambulance was dispatched. Empirical results demonstrate signif- 

O^ . icantly reduced error in forecasting call arrival volume. To quantify 

the impact of reduced forecast errors, we design a queueing model 
simulation that approximates the dynamics of an ambulance system. 
The results show better performance as the forecasting method im- 
["*"■ ' proves. This notion of quantifying the operational impact of improved 

>— ^ , statistical procedures may be of independent interest. 



1. Introduction. Considerable attention has been paid to the problem 
of how to best deploy ambulances within a municipality to minimize their 
^ , response times to emergency calls while keeping costs low. Sophisticated 

C^ ' operations research models have been developed to address issues such as 

the optimal number of ambulances, where to place bases, and how to move 
ambulances in real time via system-status management [Swersey (1994); 
Goldberg (2004); Henderson (2009)]. However, methods for estimating the 



Received July 2010; revised November 2010. 
^Supported in part by NSF Grant CMMI-0926814. 

Key words and phrases. Ambulance planning, dynamic factor model, nonhomogeneous 
Poisson process, integer-valued time series, smoothing splines. 



This is an electronic reprint of the original article published by the 
Institute of Mathematical Statistics in The Annals of Applied Statistics, 
2011, Vol. 5, No. 2B, 1379-1406. This reprint differs from the original in 
pagination and typographic detail. 

i 



2 MATTESON, MCLEAN, WOODARD AND HENDERSON 

inputs to these models, such as travel times on road networks and call arrival 
rates, are ad hoc. Use of inaccurate parameter estimates in these models can 
result in poor deployment decisions, leading to low performance and dimin- 
ished user confidence in the software. We introduce methods for estimating 
the demand for ambulances, that is, the total number of emergency calls per 
period, that are highly accurate, straightforward to implement, and have the 
potential to simultaneously lower operating costs while improving response 
times. 

Current practice for forecasting call arrivals is often rudimentary. For in- 
stance, to estimate the call arrival rate in a small region over a specific time 
period, for example, next Monday from 8-9 a.m., simple estimators have 
been constructed by averaging the number of calls received in the corre- 
sponding period in four previous weeks: the immediately previous two weeks 
and the current and previous weeks of the previous year. Averages of so few 
data points can produce highly noisy estimates, with resultant cost and ef- 
ficiency implications. Excessively large estimates lead to over-staffing and 
unnecessarily high costs, while low estimates lead to under-staffing and slow 
response times. Setzler, Saydam and Park (2009) document an emergency 
medical service (EMS) agency which extends this simple moving average to 
twenty previous observations: the previous four weeks from the previous five 
years. A more formal time series approach is able to account for possible 
differences from week to week and allows inclusion of neighboring hours in 
the estimate. 

We generate improved forecasts of the call-arrival volume by introducing 
an integer-valued time series model with a dynamic latent factor structure 
for the hourly call arrival rate. Day-of-week and week-of-year effects are in- 
cluded via simple constraints on the factor loadings. The factor structure 
allows for a significant reduction in the number of model parameters. Fur- 
ther, it provides a systematic approach to modeling the diurnal pattern 
observed in intraday counts. Smoothing splines are used in estimating the 
factor levels and loadings. This may introduce a small bias in some periods, 
but it offers a significant reduction in long-horizon out-of-sample forecast- 
error variance. This is combined with integer-valued time series models to 
capture residual dependence and to provide adaptive short-term forecasts. 
Our empirical results demonstrate significantly reduced error in forecasting 
hourly call-arrival volume. 

Few studies have focused specifically on EMS call arrival rates, and of 
those that have proposed methods for time series modeling, most have been 
based on Gaussian linear models. Even with a continuity correction, this 
method is highly inaccurate when the call arrival rate is low, which is typ- 
ical of EMS calls at the hourly level. Further, it conflicts with the Poisson 
distribution assumption used in operations research methods for optimizing 
staffing levels. For example, Channouf et al. (2007) forecast EMS demand 



FORECASTING EMS CALL ARRIVAL RATES 3 

by modeling the daily call arrival rate as Gaussian, with fixed day-of-week, 
month-of-year, special day effects and fixed day-month interactions. They 
also consider a Gaussian autoregressive moving-average (ARMA) model with 
seasonality and special day effects. Hourly rates are later estimated either 
by adding hour-of-day effects or assigning a multinomial distribution to the 
hourly volumes, conditional on the daily call volume estimates. 

Setzler, Saydam and Park (2009) provide a comparative study of EMS call 
volume predictions using an artificial neural network (ANN). They forecast 
at various temporal and spatial granularities with mixed results. Their ap- 
proach offered a significant improvement at low spatial granularity, even at 
the hourly level. At a high spatial granularity, the mean square forecast error 
(MSFE) of their approach did not improve over simple binning methods at 
a temporal granularity of three hours or less. 

Methods for the closely related problem of forecasting call center demand 
have received much more study. Bianchi, Jarrett and Choudary Hanumara 
(1998) and Andrews and Cunningham (1995) use ARMA models to improve 
forecasts for daily call volumes in a retail company call center and a tele- 
marketing center, respectively. A dynamic harmonic regression model for 
hourly call center demand is shown in Tych et al. (2002) to outperform sea- 
sonal ARMA models. Their approach accounts for possible nonstationary 
periodicity in a time series. The major drawback common to these studies 
is that the integer-valued observations are assumed to have a continuous 
distribution, which is problematic during periods with low arrival rates. 

The standard industry assumption is that hourly call-arrival volume has 
a Poisson distribution. The Palm-Khintchine theorem — stating that the su- 
perposition of a number of independent point processes is approximately 
Poisson — provides a theoretical basis for this assumption [see, e.g., Whitt 
(2002)]. Brown et al. (2005) provide a comprehensive analysis of operational 
data from a bank call center and thoroughly discuss validating classical 
queueing theory, including this theorem. Henderson (2005) states that we 
can expect the theorem to hold for typical EMS data because there are 
a large number of callers who can call at any time and each has a very low 
probability of doing so. 

Weinberg, Brown and Stroud (2007) use Bayesian techniques to fit a non- 
homogeneous Poisson process model for call arrivals to a commercial bank's 
call center. This approach has the advantage that forecast distributions for 
the rates and counts may be easily obtained. They incorporate smoothness 
in the within-day pattern. They implement a variance stabilizing transfor- 
mation to obtain approximate normality. This approximation is most ap- 
propriate for a Poisson process with high arrival rates, and would not be 
appropriate for our application in which very low counts are observed in 
many time periods. 
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Shen and Huang (2008b) apply the same variance stabilizing transfor- 
mation and achieve better performance than Weinberg, Brown and Stroud 
(2007). They use a singular value decomposition (SVD) to reduce the number 
of parameters in modeling arrival rates. Their approach is used for intraday 
updating and forecasts up to one day ahead. 

Shen and Huang (2008a) propose a dynamic factor model for 15-minute 
call arrivals to a bank call center. They assume that call arrivals are a Cox 
process. A Cox process [cf. Cox and Isham (1980)] is a Poisson process with 
a stochastic intensity, that is, a doubly stochastic Poisson process. The factor 
structure reduces the number of parameters by explaining the variability in 
the call arrival rate with a small number of unobserved variables. Estimation 
proceeds by iterating between an SVD and fitting Poisson generalized linear 
models to successively estimate the factors and their respective loadings. The 
intensity functions are assumed to be serially dependent. Forecasts are made 
by fitting a simple autoregressive time series model to the factor score series. 

We assume that the hourly EMS call-arrival volume has a Poisson dis- 
tribution. This allows parsimonious modeling of periods with small counts, 
conforms with the standard industry assumption, and avoids use of variance 
stabilizing transformations. We assume the intensity function is a random 
process and that it can be forecast using previous observations. This has an 
interpretation very similar to a Cox process, but is not equivalent since the 
random intensity is allowed to depend on not only its own history, but also 
on previous observations. We partition the random intensity function into 
stationary and nonstationary components. 

Section 2 describes the general problem and our data set. Section 3 
presents the proposed methodology. We consider a dynamic latent factor 
structure to model the nonstationary pattern in intraday call arrivals and 
greatly reduce the number of parameters. We include day-of-week and week- 
of-year covariates via simple constraints on the factor loadings of the nonsta- 
tionary pattern. Smoothing splines are easily incorporated into estimation 
of the proposed model to impose a smooth evolution in the factor levels 
and loadings, leading to improved long-horizon forecast performance. We 
combine the factor model with stationary integer-valued time series models 
to capture the remaining serial dependence in the intensity process. This is 
shown to further improve short-term forecast performance of our approach. 
A simple iterative algorithm for estimating the proposed model is presented. 
It can be implemented largely through existing software. Section 4 assesses 
the performance of our approach using statistical metrics and a queueing 
model simulation. Section 5 gives our concluding remarks. 

2. Notation and data description. We assume that over short, equal- 
length time intervals, for example, one hour periods, the latent call arrival 
intensity function can be well approximated as being constant, and that 
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Fig. 1. Mean number of calls per hour by day of the week. 



all data have been aggregated in time accordingly. We suppose aggregated 
call arrivals follow a nonhomogeneous counting process {Yi:tGZ}, with 
discrete time index t. Underlying this is a latent, real-valued, nonnegative 
intensity process {A( :t G Z}. We further assume that conditional on A^, Yt 
has a Poisson distribution with mean Aj. 

As shown in Figure 1, the pattern of call arrivals over the course of a typ- 
ical day has a distinct shape. After quickly increasing in the late morning, 
it peaks in the early afternoon, then slowly falls until it troughs between 
5 and 6 a.m. See Section 4 for more discussion. In our analysis, we consider 
an arrival process that has been repeatedly observed over a particular time 
span, specifically, a 24 hour day. Let 

{yt:t = l,...,n} = {yij :i = 1, . . . ,d;j = 1,. . . ,m} 

denote the sequence of call arrival counts, observed over time period t, which 
corresponds one-to-one with the jth sub-period of the ith day, so that n = 
dm. Our baseline approach is to model the arrival intensity A^ for the distinct 
shape of intraday call arrivals using a small number of smooth curves. 

We consider two disjoint information sets for predictive conditioning. Let 
J't = o'(yi, . . . , If) denote the c-field generated by Yi, . . . ,Yt, and let X = 
{xi, . . . ,x„} denote any available deterministic covariate information about 
each observation. We incorporate calendar information such as day-of-week 
and week-of-year in our analysis. We define Af as the conditional expectation 
of Yt given J-'t-i and X. We defined this above as the mean of If . In our model 
these coincide; however, this mean may not be the same as the conditional 
expectation since Af may depend on other unobserved random variables. Let 
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fit = E{Yt\^) > denote the conditional mean of Yt given only the covaria- 
tes X. Let 

(1) At = E{Yt\Tt-i,X) = fitE{Yt/fit\J't^i,X) = fitm, 

in which ijt > is referred to as the conditional intensity inflation rate 
(CIIR). By construction, 

^(r?t|X) = E{E{Yt\Tt-i,:S.)\X)/fit = E{Yt\X)/fit = 1- 

The CIIR process is intended to model any remaining serial dependence 
in call arrival counts after accounting for available covariates. In the EMS 
context we hypothesize that this dependence is due to sporadic events such 
as inclement weather or unusual traffic patterns. Since information regard- 
ing these events may not be available or predictable in general, we argue 
that an approach such as ours which explicitly models the remaining serial 
dependence will lead to improved short-term forecast accuracy. In Section 3 
we consider a dynamic latent factor model estimated with smoothing splines 
for modeling fit , various time series models for modeling rft , and finally a con- 
ditional likelihood algorithm for estimating the latent intensity process At 
via estimation of r]t given fj,f 

The call arrival data used consists of all emergency priority calls received 
by Toronto EMS between January 1, 2007 and December 31, 2008 for which 
an ambulance was dispatched. This includes some calls not requiring lights- 
and-sirens response, but does not include scheduled patient transfers. We 
include only the first call arrival time in our analysis when multiple calls are 
received for the same event. The data were processed to exclude calls with 
no reported location. These removals totaled less than 1% of the data. 

Many calls resulted in multiple ambulances being dispatched. Exploratory 
analysis revealed that the number of ambulances deployed for a single emer- 
gency did not depend on the day of the week, the week of the year, or exhibit 
any serial dependence. However, such instances were slightly more prevalent 
in the morning hours. Our analysis of hourly ambulance demand defines an 
event as a call arrival if one or more ambulances are deployed. 

We removed seven days from the analysis because there were large gaps, 
over at least two consecutive hours, in which no emergency calls were re- 
ceived. These days most likely resulted from malfunctions in the computer- 
aided dispatch system which led to failures in recording calls for extended 
periods. Strictly speaking, it is not necessary to remove the entire days; 
however, we did so since it had a negligible impact on our results and it 
greatly simplified out-of-sample forecast comparisons and implementation 
of the simulation studies in Section 4. 

Finally, we gave special consideration to holidays. We found that the 
intraday pattern on New Year's Eve and Day was fundamentally different 
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from the rest of the year and removed these days from our analysis. This 
finding is similar to the conclusions of Channouf et al. (2007) who found 
that New Year's Day and the dates of the Calgary Stampede were the only 
days requiring special consideration in their methodology when applied to 
the city of Calgary. In practice, staffing decisions for holidays require special 
planning and consideration of many additional variables. 

3. Modeling. Factor models provide a parsimonious representation of 
high dimensional data in many applied sciences, for example, econometrics 
[cf. Stock and Watson (2002)]. We combine a dynamic latent factor model 
with integer- valued time series models. We include covariates via simple 
constraints on the factor loadings. We estimate the model using smoothing 
splines to impose smooth evolution in the factor levels and loadings. The 
factor model provides a parsimonious representation of the nonstationary 
pattern in intraday call arrivals, while the time series models capture the 
remaining serial dependence in the arrival rate process. 

3.1. Dynamic latent factor model. For notational simplicity, assume m 
consecutive observations per day are available for d consecutive days with no 
omissions in the record. Let Y = {yij) denote the d x m matrix of observed 
counts for each day i over each sub-period j. Let ^ij = E{Yij\'X.), and let 
M = (/ijj) denote the corresponding d x m latent nonstationary intensity 
matrix. To reduce the dimension of the intensity matrix M, we introduce 
a if -factor model. 

We assume that the intraday pattern of expected hourly call arrivals on 
the log scale can be well approximated by a linear combination of (a small 
number) K factors or functions, denoted by ffc for k = l, . . . ,K. The factors 
are orthogonal length-r/i vectors. The intraday arrival rate model /Xj over 
a particular day i is given by 

(2) log/Xj = LiifiH l-LiK^K- 

Each of the factors f^ varies as a function over the periods within a day, but 
they are constant from one day to the next. Day-to-day changes are modeled 
by allowing the various factor loadings Lik to vary across days. When K is 
much smaller than either m or d, the dimensionality of the general problem 
is greatly reduced. In practice, K must be chosen by the practitioner; we 
provide some discussion on choosing K in Section 4. 
In matrix form we have 

(3) logM = LF^, 

in which F = (fi, . . . , fx) denotes the m, x K matrix of underlying factors 
and L denotes the corresponding d x K matrix of factor loadings, both of 
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which are assumed to have fuh column rank. Although other link functions 
are available, the component-wise log transformation implies a multiplica- 
tive structure among the K common factors and ensures a positive estimate 
of each hourly intensity /ijj. Since neither F nor L are observable, the ex- 
pression (3) is not identifiable. We further require F'^F = I to alleviate this 
ambiguity and we iteratively estimate F and L. 

3.2. Factor modeling with covariates via constraints. To further reduce 
the dimensionality, we impose a set of constraints on the factor loading 
matrix L. Let H denote a dx r full rank matrix (r < d) of given constraints 
(we discuss later what these should be for EMS). Let B denote an r x K 
matrix of unconstrained factor loadings. These unconstrained loadings B 
linearly combine to constitute the constrained factor loadings L, such that 
L = HB. Our factor model may now be written as 

logM = LF^ = HBF^. 

A considerable reduction in dimensionality occurs when r is much smaller 
than d. 

Constraints to assure identifiability are standard in factor analysis. The 
constraints we now consider incorporate auxiliary information about the 
rows and columns of the observation matrix Y to simplify estimation and 
to improve out-of-sample predictions. Similar constraints have been used in 
Takane and Hunter (2001), Tsai and Tsay (2010) and Matteson and Tsay 
(2011). 

For example, the rows of H might consist of incidence vectors for partic- 
ular days of the week, or special days which might require unique loadings 
on the common factors. We may choose to constrain all weekdays to have 
identical factor loadings and similarly constrain weekend days. However, this 
approach is much more general than simple equality constraints, as demon- 
strated below. 

The intraday pattern of hourly call arrivals varies from one day to the 
next, although the same general shape is maintained. As seen in Figure 1, 
different days of the week exhibit distinct patterns. We do not observe large 
changes from one week to the next, but there are significant changes over the 
course of the year. We allow loadings to slowly vary from week to week. Both 
of these features are incorporated into the factor loadings L by specifying 
appropriate constraints H. Let 

(4) logM = LFT = HBFT = (HW H(2) ) ("^J^J ") F^, 

in which the first term corresponds to day-of-week effects and the second 
to smoothly varying week-of-year effects. H^^^ is a d x 7 matrix in which 
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each row H- is an incidence vector for the day-of-week. Similarly, H*-^-* 

(2) 

is a (i X 53 matrix in which each row H^ is an incidence vector for the 
week-of-year. (We use a 53 week year since the first and last weeks may 
have fewer than 7 days.) The 7 x K matrix B^^^ = (b^ , . • • ^^k ) contains 
unconstrained factor loadings for the day-of-week and B^^^ = (b^ , • • • , b)^ ) 
is a 53 X i^ matrix of factor loadings for the week-of-year. 

3.3. Factor model estimation via smoothing splines. We assume that as 
the nonstationary intensity process fiij varies over the hours j of each day i, 
it does so smoothly. If each of the common factors f^ € M™ varies smoothly 
over sub-periods j, then the smoothness of fiij is guaranteed for each day. 
Increasing the number of factors reduces possible discontinuities between 
the end of one day and the beginning of the next. To incorporate smooth- 
ness into the model (2), we use Generalized Additive Models (GAMs) in the 
estimation of the common factors ffc . GAMs extend generalized linear mod- 
els, allowing for more complicated relationships between the response and 
predictors by modeling some predictors nonparametrically [see, e.g., Hastie 
and Tibshirani (1990); Wood (2006)]. GAMs have been successfully used 
for count-valued data in the study of fish populations [cf. Borchers et al. 
(1997); Daskalov (1999)]. The factors f^ = fk{j) are a smooth function of 
the intraday time index covariate j. The loadings L are defined as before. 
If the loadings L were known covariates, equation (2) would be a varying 
coefficient model [cf. Hastie and Tibshirani (1993)]. 

There are several excellent libraries available in the statistical package R 
[R Development Core Team (2009)] for fitting GAMs, thus making them 
quite easy to implement. We used the gam function from the mgcv library 
[Wood (2008)] extensively. Other popular libraries include the gam package 
[Hastie (2009)] and the gss package [Gu (2010)]. See Wood [(2006), Sec- 
tion 5.6] for an introduction to GAM estimation using R. 

In estimation of the model (2) via the gam function, we have used thin 
plate regression splines with a ten-dimensional basis, the Poisson family, and 
the log-link function. Thin plate regression splines are a low rank, isotropic 
smoother with many desirable properties. For example, no decisions on the 
placement of knots is needed. They are an optimal approximation to thin 
plate splines and, with the use of Lanczos iteration, they can be fit quickly 
even for large data sets [cf. Wood (2003)]. 

When the factors F are treated as a fixed covariate, the factor model 
can again be interpreted as a varying coefficient model. Given the calendar 
covariates X, let 

log Hij = FjiLji + ■■■ + FjxL^i 
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fe=l 

K 



5^F,fc{6(^)(x,) + &l')(x,)}, 



in which b\. (xj) is a piece- wise constant function of the day-of-week, and 

hj^ (xj) is a smoothly varying function over the week-of-year. We may again 
proceed with estimation via the gam function in R. Day-of-week covariates 
are simply added to the linear predictor as indicator variables. These rep- 
resent a level shift in the daily loadings on each of the factors f^. In our 
application it is appropriate to assume a smooth transition between the last 
week of one year and the first week of the next. To ensure this in estimation 
of 5^ (xj), we use a cyclic cubic regression spline for the basis [cf. Wood 
(2006), Section 5.1]. Iterative estimation of F, and L via B, for a given 
number of factors K is discussed in Section 3.5. 

We allow the degree of smoothness for the factors fk and the loadings 
function h\. (xj) to be automatically estimated by generalized cross valida- 
tion (GVC). We expect short term serial dependence in the residuals for 
our application. For smoothing methods in general, if autocorrelation be- 
tween the residuals is ignored, automatic smoothing parameter selection 
may break down [see, e.g., Opsomer, Wang and Yang (2001)]. The proposed 
factor model may be susceptible to this if the number of days included is not 
sufficiently large compared to the number of smooth factors and loadings, or 
if the residuals are long-range dependent. We use what is referred to as a per- 
formance iteration [cf. Gu (1992)] versus an outer iteration strategy which 
requires repeated estimation for many trial sets of the smoothing parameters. 
The performance iteration strategy is much more computationally efficient 
for use in the proposed algorithm, but convergence is not guaranteed, in 
general. In particular, cycling between pairs of smoothing parameters and 
coefficient estimates may occur [cf. Wood (2006), Section 4.5], especially 
when the number of factors K is large. 

3.4. Adaptive forecasting with time series models. Let ej = Yt/fit denote 
the multiplicative residual in period t implied by the fitted values fit from 
a factor model estimated as described in the previous sections. Time series 
plots of this residual process appear stationary, but exhibit some serial de- 
pendence. In this section we consider time series models for the latent CIIR 
process rjt = E{Yt/fit\J^t-i,^) to account for this dependence. 

To investigate the nature of the serial dependence, we study the bivari- 
ate relationship between the et process versus several lagged values of the 
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process et-i- Scatterplots reveal a roughly linear relationship. Residual au- 
tocorrelation and partial autocorrelation plots for one of the factor models 
fit in Section 4 are given in Figure 5(b) and (c). These quantify the strength 
of the linear relationship as the lag I increases. It appears to persist for many 
periods, with an approximately geometric rate of decay as the lag increases. 
To explain this serial dependence, we first consider a generalized autore- 
gressive linear model, defined by the recursion 

(6) r]t = uj + Qet_i + I3r]t-i ■ 

To ensure positivity, we restrict w > and a, /3 > 0. When ^uj is constant, the 
resulting model for Yt is an Integer-GARCH(1, 1) (IntGARCH) model [e.g., 
Ferland, Latour and Oraichi (2006)]. It is worth noting some properties 
of this model for the constant /xj case. To ensure the stationarity of %, 
we further require that a + (3 < 1. This sum determines the persistence of 
the process, with larger values of a leading to more adaptability. When 
this stationarity condition is satisfied, and rjt has reached its stationary 
distribution, the expectation of rjt given X is 

^(77t|X)=c^/(l-a-/3). 

To ensure E{r]t\^) = 1 for the fitted model, we may parameterize u = 1 — 
a — (5. This constraint is simple enough to enforce for the model (6) and 
we do so. However, additional parameter constraints such as this may make 
numerical estimation intractable in more complicated models and they are 
not enforced by us in the models outlined below. 

When lit is a nonstationary process, the conditional intensity 

At = ^i-tilt 

is also nonstationary. Since £'(r/t|X) = 1, we interpret r/j as the stationary 
multiplicative deviation, or inflation rate, between At and ^f The At process 
is mean reverting to the /Uf process. Let 

et = YtlX 

denote the multiplicative standardized residual process given an estimated 
CIIR process rit. If a fitted model defined by (6) sufficiently explains the 
observed linear dependence in et, then an autocorrelation plot of et should 
be statistically insignificant for all lags i. As a preview, the standardized 
residual autocorrelation plot for one such model fit in Section 4 is given in 
Figure 5(d). The serial correlation appears to have been adequately removed. 
Next, we formulate three different nonlinear generalizations of (6) that 
may better characterize the serial dependence, and possibly lead to improved 
forecasts. The first is an exponential autoregressive model defined as 

(7) T]t = aet-i + [/? + 6exp{-jr]t_i)]vt-i, 
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in which a,/3,(5, 7 > 0. Exponential autoregressive models are attractive in 
application because of their threshold-like behavior. For large 7?j-i, the func- 
tional coefficient for rjt^i is approximately /3, and for small 774-1 it is approx- 
imately (3 + 6. Additionally, the transition between these regimes remains 
smooth. As in Fokianos, Rahbek and Tj0stheim (2009), for a + /3 < 1 one 
can verify the r]t process has a stationarity version when fit is constant. 
We also consider a piecewise linear threshold model 

(8) r]t = uj + aet-i + /37?t_i + {-fCt-i + 6r]t~i)I{et_-,(^{cuC2)}^ 

in which I is an indicator variable and the threshold boundaries satisfy 
< ci < 1 < C2 < 00. To ensure positivity of r]t, we assume w, a, /3 > 0, (a + 
7) > 0, and {13 + 6) > 0. Additionally, we take 6 <0 and 7 > 0, such that 
when et-i is outside the range (ci, C2) the CIIR process rjt is more adaptive, 
that is, puts more weight on ej_i and less on rjt-i- When /ij is constant, the 
r]t process has a stationary version under the restriction a + /3 + ^ + 5< 1; 
see Woodard, Matteson and Henderson (2010). In practice, the threshold 
boundaries ci and C2 are fixed during estimation, and may be adjusted 
as necessary after further exploratory analysis. We chose ci = 1/1.15 and 
C2 = 1.15, that is, thresholds at 15% above and below 1. 

Finally, we consider a model with regime switching at deterministic times, 
letting 

rjt = (tJi + aiet^i + (3i'nt~i)I{te{ti .ta)} + (^2 + "264-1 + /32r]t^i)I{t^i^ti.t2)}- 
(9)_ 

This model is appropriate assuming the residual process has two distinct 
regimes for different periods of the day. For example, one regime could be 
for normal workday hours with the other regime being for the evening and 
early morning hours. No stationarity is possible for this model. A drawback 
of this model is that the process has jumps at ti and t2- As was the case 
for ci and C2 in (8), ti and ^2 are fixed during estimation. After exploratory 
analysis, we chose ti = 10 a.m. and ^2 = 4 p.m. 

3.5. Estimation algorithm. The estimation procedure below begins with 
an iterative algorithm for estimating the factor model from Sections 3.1-3.3 
through repeated use of the gam, function from the mgcv library in R. Any 
serial dependence is ignored during estimation of nt for simplicity. Given 
estimates for the factor model fit, conditional maximum likelihood is used 
to estimate the conditional intensity Xt via one of the time series models 
given in (6)-(9) for the CIIR process rjt. 

1. Initialization: 

(a) Fix K and H. 

(b) Choose some c G (0, 1) and define Yc = {yij V c). 
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(c) Apply a singular value decomposition (SVD) to find log(Yc) = 
UoDoVT. 

(i) Let Uq ■ denote the first K columns of the left singular matrix 
Uo. 

(ii) Let Vq ' denote the first K columns of the right singular ma- 
trix Vq. 

(iii) Let Dq ' denote the upper- left K x K sub-matrix of Dq, the 
diagonal matrix of singular values. 

(d) Assign Lo = v'i'-^^-D^o''^^ and Fq = Y^^'^K 

No smoothing is performed and the constraints H are omitted in 
initialization. 

2. Update: 

(a) Fit the Poisson GAM model described in Section 3.3 with F = F„ 
and H as fixed covariates. 

• Assign B^* as the estimated parameter values from this fit and let 
L„* = HB„* . 

(b) Fit the Poisson GAM model described in Section 3.3 with L = L„* 
as a fixed covariate. 

• Assign F„* as the estimated parameter values from this fit. 

(c) Apply an SVD to find B^.F^ = U^+iD^+iV^^i. 

(i) Assign B„+i = ul^^)Di^f). 

(ii) Assign F„+i=vi^;f\ 
(iii) Assign L„+i = HB„+i. 

(d) Let logM„+i = L„+iFT^i. 

3. Repeat the update steps recursively until convergence. 

Convergence is reached when the relative change in M is sufficiently small. 
After convergence we can recover log /2t from the rows of the final estimate 
of logM. These values are then treated as fixed constants during estimation 
of rjt. We use conditional maximum likelihood to estimate the parameters 
(w, a,/3, . . .) associated with a time series model for rjt. The recursion defined 
by (6)-(9) requires initialization by choosing a value for rji; the estimates 
are conditional on the chosen initialization. 

We may always specify the joint distribution Py of the observations Y as 
an iterated product of successive conditional distributions Py^ for Yj given 

(yi_i,...,yi)as 

T 

PY{yT,yT-i, ■ ■ ■ ,y2,yi) = PYAyi)Y\.PYt{yt\yt-i, ■ ■ ■ ,yi)- 

t=2 
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We follow the standard convention of fixing Py-^(yi) = 1 in estimation. For 
large sample sizes the practical impact of this decision is negligible. We may 
therefore write the log likelihood function as the sum of iterated conditional 
log likelihood functions. The conditional distribution for the observations is 
assumed to be Poisson with mean At = ^tr/^. 

For uninterrupted observations over periods 1 , . . . , T, we define the log 
likelihood function as 



e{uj,a,f3,...\M,Y,r]i) = ^£ti^,a,^,. . .\yt,yt-i,11t,Jit-i,r]t-i) 

t=2 
T 

(10) =Y,iyt^ogXt-\t-logyt\) 



t=2 



"^{ytlogCfltTit) - Jitvt - logy*! 



t=2 

This recursion requires an initial value for rji. For simplicity, we use its ex- 
pected value, rji = 1. When there are gaps in the observation record, equa- 
tion (10) is calculated over every contiguous block of observations. This 
requires reinitialization of r/j = 1 at the beginning of each block. The log 
likelihood for the blocks are then added together to form the entire log like- 
lihood. The maximum likelihood estimate is the argmax of this quantity, 
subject to the constraints given in Section 3.4. Finally, rjt is estimated by 
the respective recursion given by equations (6)-(9) with parameters replaced 
by their estimates, again with reinitialization of r^t = 1 at the beginning of 
each contiguous block of observations. Blocks were large enough in our ap- 
plication that the effect of reinitialization was negligible. 

4. Empirical analysis. Using the data described in Section 2, we per- 
form the following analysis: (a) we define various statistical goodness-of-fit 
metrics suitable for the proposed models; based on in-sample performance, 
these metrics are used to determine the number of factors K for use in the 
dynamic factor models, (b) We compare the out-of-sample forecast perfor- 
mance for the factor model in (3), the factor model with constraints in (4), 
and the factor model with constraints and smoothing splines in (5). These 
comparisons help ascertain the improvement from each refinement and vali- 
date the proposed selection methods for K. (c) For the latter factor model, 
we compare the out-of-sample forecast performance with the addition of the 
CIIR process, via use of the various time series models defined in Section 3.4. 
(d) We quantify the practical impact of these successive statistical improve- 
ments with a queueing application constructed to approximate ambulance 
operations. 
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4.1. Interpreting the fitted model. The mean number of calls was approx- 
imately 24 per hour for 2007 and 2008, and no increasing or decreasing linear 
trend in time was detected during this period. We partition the observations 
by year into two data sets referred to as 2007 and 2008, respectively. Each 
year is first regarded as a training set, and each model is fit individually to 
each year. The opposite year is subsequently used as a test set to evaluate 
the out-of-sample performance of each fitted model. To account for missing 
days, we reinitialize the CIIR process r]t in the first period following each 
day of missing data. This was necessary at most five times per year including 
the first day of the year. 

We found the factor model fit with constraints, smoothing splines, and 
K = 4: factors to be the most appropriate of the factor models considered. 
The estimated factors f^ for 2008 are shown in Figure 2(a). Each of the 
four factors varies smoothly over the hours of the day via use of smooth- 
ing splines. The first factor fi is strictly positive and the least variable. It 
appears to capture the mean diurnal pattern. The factor {2 appears to iso- 
late the dominant relative differences between weekdays and weekend days. 
The defining feature of £3 and £4 is the large increase late in the day, cor- 
responding closely to the relative increase observed on Friday and Saturday 
evenings. However, £3 decreases in the morning, while £4 increases in the 
morning and decreases in the late afternoon. As K increases, additional fac- 
tors become increasingly more variable over the hours of the day. Too many 
factors result in overfitting the model, as the extra factors capture noise. 

The corresponding daily factor loadings L for the first four weeks of 2008 
are shown in Figure 2(b). The loadings (Li — 14.5) are shown to simplify 
comparisons. The much higher loadings on fi confirm its interpretation as 
capturing the mean. The peaks on Fridays coincide with Friday having the 
highest average number of calls, as seen in Figure 1. Weekdays get a pos- 
itive loading on £2, while weekend days get negative loading. Loadings on 
£3 are lowest on Sundays and Mondays and loadings on £4 are largest on 
Fridays and Saturdays. As K increases, the loadings on additional factors 
become increasingly close to zero. This partially mitigates the overfitting 
described above. Factors with loadings close to zero have less impact on the 
fitted values Jit- Nevertheless, they can still reduce out-of-sample forecast 
performance. 

The daily factor loadings for all of 2008 are shown in Figure 2(c). The 
relative magnitude of each loading vector with respect to day-of-week is 
constant. This results from use of the constraint matrix H^^-* in (4). As the 
loadings vary over the days of the week, they also vary smoothly over the 
course of the year, via use of the constraint matrix H^^' and the use of cyclic 
smoothing splines in estimation of B'^' in (4). The loadings on fi show how 
the expected number of calls per day varies over the year. The week to week 
variability in the other loadings influences how the days of the week change 
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Fig. 2. 2008 fitted (a) factor levels fk (log-linear scale) and [{h) and (c)] corresponding factor loadings Lfc. (log-linear scale) for a factor 
model fit with constraints, smoothing splines and K = 4 factors. (Li. — 14.5) is shown for easier comparison. 
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Fig. 3. The estimated intensity process p.^, for every day in 2008, for a factor model fit 
with constraints, smoothing splines and K = A factors, colored by day-of-week, and shaded 
light to dark by week- of -year. 

relative to each other over the year. Figure 3 shows the estimated intensity 
process /Xj for every day in 2008, shaded by day-of-week. The curves vary 
smoothly over the hours of the day. The fit for each day of the week keeps 
the same relative shape, but it varies smoothly over the weeks of the year. 
Section 3.4 described incorporating time series models to improve the 
short-term forecasts of a factor model. The models capture the observed 
serial dependence in the multiplicative residuals from a fitted factor model; 
see Figure 5. Parameter estimates and approxiixiate standard errors for the 
IntGARCH model are given in Supplemental material (Table 1). A fitted 
factor model /2j using constraints, smoothing splines and K = A, as well as 
the factor model including a fitted IntGARCH(l, 1) model A^, are also shown 
in Figure 6(a), with the observed call arrivals per hour for Weeks 8 and 9 

of 2007. The A( process is mean reverting about the lit process. They are 
typically close to each other, but when they differ by a larger amount, they 
tend to differ for several hours at a time. The corresponding fitted CIIR pro- 
cess % is shown in Figure 6(b). This clearly illustrates the dependence and 
persistence exhibited in Figure 6(a). The CIIR process ranges between ±6% 
during this period. With a mean of 24 calls per hour, this range corresponds 
to At varying about /2t by about ±1.5 expected calls per hour. 



4.2. Goodness of fit and model selection. To evaluate the fitted values 
and forecasts of the proposed models, three types of residuals are computed: 
multiplicative, Pearson and Anscombe. Their respective formulas for the 
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Poisson distribution are given by 

yt , ^ yt-\t ^ (3/2)(y,'/3-A?/3) 
n/,t = — -1, rp, = ^, rA,t- 



^* a/a. ^t 



We refer to the root mean square error (RMSE) of each metric as RMSME, 
RMSPE and RMSAE, respectively. The multiphcative residual is defined as 
before and is a natural choice given the definition for the CIIR. Since the 
variance of a Poisson random variable is equal to its mean, the Pearson resid- 
ual is quite standard. However, the Pearson residual can be quite skewed for 
the Poisson distribution [cf. McCullagh and Nelder (1989), Section 2.4]. The 
Anscombe residual is derived as a transformation that makes the distribu- 
tion of the residuals as close to Gaussian as possible while suitably scaling to 
stabilize the variance. See Pierce and Schafer (1986) for further discussion 
of residuals for generalized linear models. While the three methods always 
yielded the same conclusion, we found use of the Anscombe residuals gave 
a more robust assessment of model accuracy and simplified paired compar- 
isons between the residuals of competing models. 

The three RMSE metrics were used for both in- and out-of-sample model 
comparisons. For in-sample comparisons of the factor models, we also com- 
puted the deviance of each fitted model //j. As a goodness-of-fit metric, 
deviance is derived from the logarithm of a ratio of likelihoods. For a log 
likelihood function £(/x|Y), it is defined as 

-2{^(/^ = /2|Y)-£(/x = Y|Y)}, 

in general. For a fitted factor model, ignoring serial dependence, the deviance 
corresponding to a Poisson distribution is 



2 '^{yt'^og{yt/V't) - {yt - V-t)}, 



t=i 

in which the first term is zero if y^ = 0. 

We compare the fitted models' relative reduction in deviance and RMSE 
as we increase the number of factors K. Figure 4 shows these results for 
factor models fit to 2007 data with constraints and smoothing splines. The 
results for other models and for 2008 were very similar. This plot may be 
interpreted similarly to a scree plot in PCA by identifying the point at 
which performance tapers off and the marginal improvement from additional 
factors is negligible. Under each scenario we consistently selected K = 4 
factors through this graphical criterion. To further justify this as a factor 
selection strategy, we also consider the impact the number of factors K has 
on out-of-sample performance for each of the proposed models below. This 
approach is straightforward, but it does not fully account for the uncertainty 
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Fig. 4. 2007 percentage m-sample relative goodness-of-fit improvement by addition of 
one factor (K — > A' + IJ for a factor model fit with constraints and smoothing splines. 

on the number of factors. Bayesian estimation would require specialized 
computation, but it may improve model assessment [see, e.g.. Lopes and 
West (2004)]. 

4.3. Out-of-sample forecast performance. Out-of-sample comparisons we- 
re made by fitting models to the 2007 training set and forecasting on the 
2008 test set, and vice versa. To make predictions comparable from one 
year to the next, we align corresponding calendar weeks of the year, not 
days of the year. This ensures that estimates for Sundays are appropriately 
compared to Sundays, etc. 

The first model considered was the simple prediction (SP) method. This 
simple moving average involving four observations was defined in the 
Introduction. Next, the forecasts of various factor models (FM) were con- 
sidered. For K = 1,. . . ,6, we evaluated the forecasts from the FM in (3), 
the FM with constraints in (4), and the FM with constraints and smoothing 
splines in (5). Finally, for the latter FM, with K = 4, we calculate the im- 
plied fit from the training set with the inclusion of the CIIR process via the 
various time series models defined in Section 3.4. We compute the forecast 
RMSE of each model for the three residual types, for both years. 

The forecast results are shown in Table 1. The basic FMs did slightly worse 
than the SP both years. With only one year of observations, these FMs tend 
to overfit the training set data, even with a small number of factors. The 
FMs with constraints give a very significant improvement over the previous 
models. The forecast RMSE is lowest aX K = A for the 2007 test set, and at 



Table 1 
Root mean square multiplicative, Pearson, and Anscombe errors for fitting model to 2007 and forecasting 2008, and vice versa 
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Constraint 


Smoothing 


2007 model, 2008 residuals 


2008 model, 2007 residuals 


Model 


RMSME 


RMSPE 


RMSAE 


RMSME 


RMSPE 


RMSAE 


Simple prediction 


NA 


NA 


0.2696 


1.1955 


1.1849 


0.2661 


1.1902 


1.1925 


Factor model 


K = l 


No 


No 


0.2722 


1.2369 


1.2237 


0.2657 


1.2183 


1.2263 


Factor model 


K = 2 


No 


No 


0.2721 


1.2357 


1.2225 


0.2661 


1.2197 


1.2277 


Factor model 


K = 3 


No 


No 


0.2727 


1.2374 


1.2239 


0.2659 


1.2182 


1.2262 


Factor model 


K = 4 


No 


No 


0.2729 


1.2383 


1.2249 


0.2666 


1.2206 


1.2283 


Factor model 


K = 5 
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1.2395 
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Factor model 


K = 6 
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0.2436 
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K = l 
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A Yes in the constraints column implies that the factor model was fit using the constraints outlined in Section 3.2. A Yes in the smoothing 
column indicates that the model was fit using smoothing splines as described in Section 3.3. 
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Fig. 5. (a) Sample autocorrelation function for hourly call arrival counts yt- Residual 
Ct = yt/fit (b) autocorrelation and (c) partial autocorrelation functions for fitted factor 
model fit with fe = 4 factors using factor and loading constraints and smoothing splines. 
(d) Standardized residual e't — yt/Xt = yt/CPtfjt) autocorrelation function for fitted fac- 
tor model with fitted IntGARCH(l, 1) model for r]t. Dashed lines give approximate 95% 
confidence levels. 



K = 3 for the 2008 test set. There was also a very large decrease between 
K = 1 and K = 2. The FMs with constraints and smoothing splines offered 
an additional improvement. The forecast RMSE is lowest at if = 4 for both 
test sets. With the addition of the IntGARCH model for the CIIR process 
to this model, the RMSE improved again. Application of the nonlinear time 
series models instead offered only a slight improvement over the IntGARCH 
model. 

With only one year of training data, each FM begins to overfit with K = 5 
factors. Results were largely consistent regardless of the residual used, but 
the Anscombe residuals were the least skewed and allowed the simplest 
pairwise comparisons. Although the FMs with constraints had superior in- 
sample performance, the use of smoothing splines reduced the tendency to 
over-fit and resulted in improved forecast performance. The CIIR process 
offered improvements in fit over FMs alone. 
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Fig. 6. Weeks 8 and 9 of 2007: (a) observed call arrivals per hour yt, fitted K = A 
dynamic factor model ^t using constraints and smoothing splines, and factor model Xt 
including fitted IntGARCH(l, 1); (b) the fitted conditional intensity inflation process fjt 
from the IntGARCH(l, 1) model. 

We also fit each of the nonhnear time series models discussed in Sec- 
tion 3.4 using a FM with K = 4:. The regime switching model had the best 
performance. It had the lowest RMSE for both test sets. The exponential 
autoregressive and the piecewise linear threshold models performed similarly 
to the IntGARCH model for both test sets. Although the nonlinear models 
consistently performed better in-sample, their out-of-sample performance 
was similar to the IntGARCH model. 



4.4. Queueing model simulation to approximate amhulance operations. 
To comprehensively improve ambulance operations, it would be advanta- 
geous to simultaneously model the service duration of dispatched ambu- 
lances in addition to the demand for ambulance service. Unfortunately, such 
information was not available. We are currently working with Toronto EMS 
to use our improved estimates of call arrival rates to improve staffing in their 
dispatch call center. Extending our approach to a spatial-temporal forecast- 
ing model will likely be used to help determine when and where to deploy 
ambulances. 

We present a simulation study that uses a simple queueing system to quan- 
tify the impact that improved forecasts have on staffing decisions and relative 
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operating costs, for the Toronto data. The queueing model is a simpUfication 
of ambulance operations that ignores the spatial component. Similar queue- 
ing models have been used frequently in EMS modeling [see Swersey (1994), 
page 173]. This goodness-of-fit measure facilitates model comparisons and 
a similar approach may be useful in other contexts. 

We use the terminology employed in the call center and queueing theory 
literature throughout the section; for our application, servers are a proxy 
for ambulances, callers or customers are those requiring EMS, and a server 
completing service is equated to an ambulance completing transport of a per- 
son to a hospital, etc. As before, let yt denote the observed number of call 
arrivals during hour t. Our experiment examines the behavior of a simple 
M/M/s queueing system. The arrival rate in time period t is At. During 
this period, let sj denote the number of servers at hand. For simplicity, we 
assume that the service rate z/ for each server is the same, and constant over 
time. Furthermore, intra-hour arrivals occur according to a Poisson process 
with rate At , and service times of callers are independent and exponentially 
distributed with rate v. 

As in Section 4.3, models are calibrated on one year of observations and 
forecasts for At are made for the other year. Each model's forecasts At are 
then used to determine corresponding staffing levels St for the system. 

To facilitate comparisons of short-term forecasts, we assume that the num- 
ber of servers can be changed instantaneously at the beginning of each pe- 
riod. In practice, it is possible to adjust the number of ambulances in real 
time, but not to the degree that we assume here. 

Each call has an associated arrival time and service time. When a call 
arrives, the caller goes immediately into service if a server is available, oth- 
erwise it is added to the end of the queue. A common goal in EMS is to ensure 
that a certain proportion of calls are reached by an ambulance within a pre- 
specified amount of time. We approximate this goal by instead aiming to an- 
swer a proportion, 9, of calls immediately; this is a standard approximation 
in queueing applications in many areas including EMS [Kolesar and Green 
(1998)]. For each call arrival, we note whether or not the caller was served 
immediately. As servers complete service, they immediately begin serving 
the first caller waiting in the queue, otherwise they await new arrivals if the 
queue is currently empty. One simulation replication of the queueing system 
simulates all calls in the test year. 

To implement the queueing system simulation, it is first necessary to 
simulate arrival and service times for each caller in the forecast period. 
We use the observed number of calls for each hour yt as the number of 
arrivals to the system in period t. Since arrivals to the system are assumed 
to follow a Poisson process, we determine the yt call arrival times using the 
well-known result that, conditional on the total number of arrivals in the 
period [t,t + 1], the arrival times have the same distribution as the order 
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statistics of yt independent Uniforni(i,t + 1) random variables. We exploit 
this relationship to generate the intra-hour arrival times given the observed 
arrival volume yt- The service times for each call are generated independently 
with an Exponential(i^) distribution. 

The final input is the initial state of the queue within the system. We 
generate an initial number of callers in the queue as Poisson(yi), then in- 
dependently generate corresponding Exponential(i^) residual service times 
for each of these callers. This initialization is motivated through an infinite- 
server model; see, for example, Kolesar and Green (1998). Whenever there 
is a missing day, in either the test set or corresponding training set period, 
we similarly reinitialize the state of the queue but with yi replaced by the 
number of calls observed in the first period following the missing period. 
These initializations are common across the different forecasting methods to 
allow direct comparisons. 

To evaluate forecast performance, we define a cost function and an ap- 
propriate method for determining server levels from arrival rate estimates. 
Let nt denote the number of callers served immediately in period t. The 
hourly cost function is given by 

C{nt, yt, St) = Pen(nt, yt) + st, 
in which 

Penfn ) = I ^' if ^t > ^Vu 

\ t,ytj \^q{yt — nt), otherwise, 

6 € (0,1) is the targeted proportion of calls served immediately, and q >0 
is the cost of not immediately serving a customer, relative to the cost of 
staffing one server for one hour. The total cost, with respect to the hourly 
server cost, for the entire forecast period is 

C = ^ C{7it, yt, St) = ^ Pen(nt, yt) + ^ st- 

t t t 

This approach, where penalties for poor service are balanced against staffing 
costs, is frequently used; see, e.g., Andrews and Parsons (1993), Harrison, 
Zeevi and Shum (2005). 

At time t — 1, the number of call arrivals and the number served immedi- 
ately are random variables, denoted as Yt and Nt, respectively. A natural ob- 
jective is to choose staffing levels that minimize the hourly expected cost as 

(11) st = s.Tgm:inE{C{Nt,Yt,st)\Ft-i,^}, 

steN 

in which Yt is assumed to have a Poisson distribution with mean equal to 
the arrival rate forecast A(. The staffing levels are then a function of arrival 
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rate forecasts, s'i(At). We approximate this expectation numerically by ran- 
domly generating J independent realizations as Ytj ~ Poisson(At). Then, for 
each Ytj we simulate one independent realization of Nf. For a fixed value 
of St the expectation is approximated by J~^ Si=i{P6n(A'^tj, Itj) + st}. We 
found that J = 25,000 provided adequate accuracy. 

Independent realizations of Nt\Yt require running the queueing system 
forward one hour, but this is very computationally intensive. To approxi- 
mate Nt\Yt, we use a Binomial distribution. Let Ntj\Ytj -^ Binomialjyij, 
5(Aj,sj, i^)}. The function g gives the steady state probability that a cus- 
tomer is served immediately for a queueing system with a constant arrival 
rate, server level and service rate, At,sj and z^, respectively. Derivation of 
this function is available in any standard text on queueing theory [e.g., Gross 
and Harris (1998), Chapter 2]. 

Let Pi denote the long run proportion of time such a system contains i 
customers and let p = \/{vs). Then 

lo, ifp>l, 

c-l 



u=0 



in which Pn = > —: -\ — r, r for p < 1. 

" ^ u! c\il-p) 



When /9 > 1, the arrival rate is faster than the net service rate, and the sys- 
tem is unstable; the long run probability that a customer is served immedi- 
ately is zero. The binomial approximation greatly reduces the computational 
costs and provides reasonable results, though it tends to underestimate the 
true variability of Nt\Yt due to the positive correlation in successive caller 
delays. 

A final deliberation is needed on the removal of servers when sj decreases. 
In our implementation, idle servers were removed first, and, if necessary, busy 
servers were dropped in ascending order with respect to remaining service 
time. We also considered random selection of servers to be dropped. Doing 
so produced highly variable results, and is under further study. To further 
simplify the implementation, if it was necessary to drop a busy server, it 
was simply discarded, along with any remaining service time for that caller. 
The effect of this simplification depends on the service rate v; our results 
did not appear to be sensitive to this simplification. 

Simulation of the queueing system is now rather straightforward. On 
each iteration i, we note whether each caller was served immediately or 
not. Forecast performance is assessed by examining the total cost C^^' = 
J2t^i''^t ^yt^^t) over the test period. For both years, we performed 100 
simulations over the test year for each forecast method. To demonstrate the 
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Fig. 7. Mean total per period cost over 100 simulations for different forecasting methods 
and different values of q, v and 6. Plots (a)-(d) use the 2008 test set and plots (e)-(h) 
use 2007 as the test set. The vertical lines represent ±1 standard deviation. 



robustness of this methodology, we performed the experiment for several 
different values of the queuing system's parameters. Specifically, all combi- 
nations of g G {2, 5, 10}, u € {1, |}, and 9 G {0.8, 0.9} were considered, after 
consultation with EMS experts. 

Results for the mean hourly cost over the 100 simulations for each fore- 
casting method, for each test year, are summarized in Figure 7. We see that 
the mean hourly cost is lowest for the FM w/ IntGARCH, followed by the 
FM only, and finally by SP. All pairwise differences in mean were highly 
significant; the smallest i-ratio was 80. In fact, this ordering in performance 
held for almost every iteration of the queueing system, not just on average. 

The mean percentage of callers served immediately can be found in Fig- 
ure 8. The total number of server hours ^ st used was also recorded for 
each model for each set of parameter values. A table containing the values 
of all these quantities can be found in the online supplemental material. 
Both mean percentage served immediately and mean hourly cost increase 
with q. For each test year, for each level of {q, u, 9), ^^ sj differed by between 
one and three thousand server-hours, for the different models. 
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Fig. 8. Mean percentage served immediately for the entire test set over 100 simulations 
for different forecasting methods and different values of q, v and 6. Plots (a)-(d) use the 
2008 test set and plots (e)-(h) use 2007 as the test set. The vertical lines represent ±1 
standard deviation. 



5. Conclusions. Our analysis was motivated by a data set provided by 
Toronto EMS. The proposed forecasting method aUows parsimonious mod- 
ehng of the dependent and nonstationary count-valued EMS call arrival 
process. Our method is straightforward to implement and demonstrates sub- 
stantial improvements in forecast performance relative to simpler forecast- 
ing methods. We measured the impact of our successive refinements to the 
model, showing the merit of factor model estimation with covariates and 
smoothing splines. The factor model was able to capture the nonstationary 
behavior exhibited in call arrivals. Introduction of the CIIR process allowed 
adaptive forecasts of deviations from this diurnal pattern. 

Assessing the impact that different arrival rate forecasts can have on call 
centers and related applications has received very little attention in the 
literature. Our data-based simulation approach is straightforward to imple- 
ment, and was able to clearly distinguish the effectiveness of each forecasting 
method. The simulation results coincide with the out-of-sample RMSE anal- 
ysis in Section 4.3 and provide a practical measure of forecast performance. 
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Relative operating cost is a natural metric for measuring call arrival rate 
forecasts, and our implementation may easily be extended to many cus- 
tomized cost functions and a wide variety of applications. 

Ultimately, we seek to strengthen emergency medical service by improving 
upon relevant statistical methodology. Future work will consider inclusion 
of additional covariates and study of other nonlinear time series models. 
Bayesian methods which directly model count-valued observations have de- 
sirable properties for inference and many applications, and are under study. 
Spatial and spatial-temporal analysis of call arrivals will also offer new ben- 
efits to EMS. 

Acknowledgments. The authors sincerely thank Toronto EMS for shar- 
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SUPPLEMENTARY MATERIAL 

Supplement A: Additional tables (DOL 10. 1214/10- AOAS442SUPPA; .pdf). 
Tables 1 and 2. 

Supplement B: Estimation algorithms (DOL 10. 1214/10- AOAS442SUPPB; 
.R). R code for estimating the models in Section 3 and for calculating the 
RMSE metrics in Section 4. 

Supplement C: Simulation algorithms (DOL 10.1214/10-AOAS442SUPPC; 
.R). R code for implementing the queueing model simulation in Section 4.4. 
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